jfblog

Random logs and notes

Convert xls files to csv

I had to convert a bunch of xlsx files to csv. A simple solution is obtained thanks to the libraries in http://www.python-excel.org/. More specifically, I only had to use xlrd. It is as simple as

wb = xlrd.open_workbook(file)
sh = wb.sheet_by_name('Sheet1')

Now the sh object contains the first sheet and we can use methods to read the content. The rest of the work consists in opening a new file for writing and do the actual writing, thanks to the csv library. I also use the argparse library to parse scripts arguments.

In [3]:
%%file xls2csv.py 
import xlrd
import csv
import glob
import os
import sys

def csv_from_excel(file):

    wb = xlrd.open_workbook(file)
    sheets=wb.sheet_names()
    sh = wb.sheet_by_name(sheets[0])
              
    if sys.version_info >= (3,0,0):
        your_csv_file = open(file.split('.xls')[0] + '.csv', 'w', newline='')
        #Here the split operation works for both xls and xlsx 
        #extensions, since we only keeo the [0] part
    else:
        your_csv_file = open(file.split('.xls')[0] + '.csv', 'wb')
    wr = csv.writer(your_csv_file, delimiter='\t', quoting=csv.QUOTE_NONE)

    for rownum in range(sh.nrows):
        wr.writerow(sh.row_values(rownum))

    your_csv_file.close()
	

verbose=True    
if __name__ == '__main__':	
    import argparse

    whatitdoes="This program converts a series of xls[x] files into csv."
    myself="(c) JFB 2014"
    parser = argparse.ArgumentParser(description=whatitdoes, epilog=myself)
    # mandatory argument
    parser.add_argument(
    help = 'List of files to convert (accepts regular expressions)',
    dest = 'argfiles', default = '*.xls*', type = str,  nargs = '*')
    # verbosity flag
    parser.add_argument('-v','--verbose', help = 'Prints information',
    dest = 'verbose', default = False,   #action='store_true'
    action='count')

    arguments = parser.parse_args()
    verbose=arguments.verbose
    if verbose==2: print("script arg: ", arguments.argfiles)
    xlsx_files = glob.glob(arguments.argfiles[0])
    if verbose==2: print("glog.glog expansion: ", xlsx_files, '\n')
    if len(xlsx_files) == 0:
        raise RuntimeError('No XLSX files to convert.')
          
    for file in xlsx_files:
        if verbose:
            print("Converting {}".format(file))
        csv_from_excel(file)
Writing xls2csv.py

And now let us test it:

In [7]:
ls Notes*.xls
NotesB1ST02-ISBS1-2012.xls     NotesB2ST12- AdaEtClassif-ISBS2-2013-14 (1).xls  NotesB2ST12-ISBS2-2013-14.xls
NotesB2ST02-ISBS2-2013-14.xls  NotesB2ST12- AdaEtClassif-ISBS2-2013-14.xls

In [8]:
%run xls2csv -h
usage: xls2csv.py [-h] [-v] [argfiles [argfiles ...]]

This program converts a series of xls[x] files into csv.

positional arguments:
  argfiles       List of files to convert (accepts regular expressions)

optional arguments:
  -h, --help     show this help message and exit
  -v, --verbose  Prints information

(c) JFB 2014

In [9]:
%run xls2csv -vv "Note\*.xls"
script arg:  ['Note*.xls']
glog.glog expansion:  ['NotesB2ST12- AdaEtClassif-ISBS2-2013-14 (1).xls', 'NotesB2ST12-ISBS2-2013-14.xls', 'NotesB2ST02-ISBS2-2013-14.xls', 'NotesB2ST12- AdaEtClassif-ISBS2-2013-14.xls', 'NotesB1ST02-ISBS1-2012.xls'] 

Converting NotesB2ST12- AdaEtClassif-ISBS2-2013-14 (1).xls
Converting NotesB2ST12-ISBS2-2013-14.xls
Converting NotesB2ST02-ISBS2-2013-14.xls
Converting NotesB2ST12- AdaEtClassif-ISBS2-2013-14.xls
Converting NotesB1ST02-ISBS1-2012.xls

Running this notebook will produce the script xls2csv.py. Otherwise, this can be also downloaded here.

In [11]:
HTML(the_end(theNotebook))
Out[11]:

This post was written as an IPython notebook. It is available for download or as a static html.

Creative Commons License
jfblog by J.-F. Bercher is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Based on a work at http://jfbercher.github.io/.

misc xls

Comments